Overview

Dataset statistics

Number of variables: 29
Number of observations: 2240
Missing cells: 24
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.6 KiB
Average record size in memory: 232.1 B
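
As a quick sanity check, the derived figures in this overview can be recomputed from the raw counts. The sketch below uses only numbers reported above; the variable names are illustrative, and it assumes (the report does not say so explicitly) that the missing-cell percentage is taken over all cells, i.e. variables times observations.

```python
# Figures copied from the dataset statistics above.
n_variables = 29
n_observations = 2240
missing_cells = 24
total_size_kib = 507.6

# Assumption: the percentage is taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes, from the rounded total size (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under that assumption the missing share is well below 0.1%, and the average record size agrees with the reported 232.1 B up to rounding of the total size.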

Variable types

NUM: 15
BOOL: 7
CAT: 7

Reproduction

Analysis started: 2020-10-30 00:02:45.855908
Analysis finished: 2020-10-30 00:03:12.935569
Duration: 27.08 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Z_CostContact has constant value "3" [Constant]
Z_Revenue has constant value "11" [Constant]
Dt_Customer has a high cardinality: 663 distinct values [High cardinality]
Income has 24 (1.1%) missing values [Missing]
ID has unique values [Unique]
Recency has 28 (1.2%) zeros [Zeros]
MntFruits has 400 (17.9%) zeros [Zeros]
MntFishProducts has 384 (17.1%) zeros [Zeros]
MntSweetProducts has 419 (18.7%) zeros [Zeros]
MntGoldProds has 61 (2.7%) zeros [Zeros]
NumDealsPurchases has 46 (2.1%) zeros [Zeros]
NumWebPurchases has 49 (2.2%) zeros [Zeros]
NumCatalogPurchases has 586 (26.2%) zeros [Zeros]

Variables

ID
Real number (ℝ≥0)

UNIQUE

Distinct count: 2240
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5592.159821428571
Minimum: 0
Maximum: 11191
Zeros: 1
Zeros (%): < 0.1%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 576.85
Q1: 2828.25
median: 5458.5
Q3: 8427.75
95-th percentile: 10675.05
Maximum: 11191
Range: 11191
Interquartile range (IQR): 5599.5

Descriptive statistics

Standard deviation: 3246.662198
Coefficient of variation (CV): 0.5805739287
Kurtosis: -1.190028038
Mean: 5592.159821
Median Absolute Deviation (MAD): 2791
Skewness: 0.0398318728
Sum: 12526438
Variance: 10540815.43

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
2546                 1      < 0.1%
967                  1      < 0.1%
5396                 1      < 0.1%
5394                 1      < 0.1%
7441                 1      < 0.1%
7437                 1      < 0.1%
5386                 1      < 0.1%
7433                 1      < 0.1%
7431                 1      < 0.1%
9500                 1      < 0.1%
Other values (2230)  2230   99.6%

Value   Count  Frequency (%)
0       1      < 0.1%
1       1      < 0.1%
9       1      < 0.1%
13      1      < 0.1%
17      1      < 0.1%

Value   Count  Frequency (%)
11191   1      < 0.1%
11188   1      < 0.1%
11187   1      < 0.1%
11181   1      < 0.1%
11178   1      < 0.1%
 

Year_Birth
Real number (ℝ≥0)

Distinct count: 59
Unique (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1968.8058035714287
Minimum: 1893
Maximum: 1996
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1893
5-th percentile: 1950
Q1: 1959
median: 1970
Q3: 1977
95-th percentile: 1988
Maximum: 1996
Range: 103
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.98406946
Coefficient of variation (CV): 0.006086973858
Kurtosis: 0.7174644425
Mean: 1968.805804
Median Absolute Deviation (MAD): 9
Skewness: -0.3499438592
Sum: 4410125
Variance: 143.6179207

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
1976                 89     4.0%
1971                 87     3.9%
1975                 83     3.7%
1972                 79     3.5%
1978                 77     3.4%
1970                 77     3.4%
1973                 74     3.3%
1965                 74     3.3%
1969                 71     3.2%
1974                 69     3.1%
Other values (49)    1460   65.2%

Value   Count  Frequency (%)
1893    1      < 0.1%
1899    1      < 0.1%
1900    1      < 0.1%
1940    1      < 0.1%
1941    1      < 0.1%

Value   Count  Frequency (%)
1996    2      0.1%
1995    5      0.2%
1994    3      0.1%
1993    5      0.2%
1992    13     0.6%
 

Education
Categorical

Distinct count: 5
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value       Count  Frequency (%)
Graduation  1127   50.3%
PhD         486    21.7%
Master      370    16.5%
2n Cycle    203    9.1%
Basic       54     2.4%

Length

Max length: 10
Median length: 10
Mean length: 7.51875
Min length: 3

Marital_Status
Categorical

Distinct count: 8
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value     Count  Frequency (%)
Married   864    38.6%
Together  580    25.9%
Single    480    21.4%
Divorced  232    10.4%
Widow     77     3.4%
Alone     3      0.1%
Absurd    2      0.1%
YOLO      2      0.1%

Length

Max length: 8
Median length: 7
Mean length: 7.073214286
Min length: 4

Income
Real number (ℝ≥0)

MISSING

Distinct count: 1974
Unique (%): 89.1%
Missing: 24
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 52247.25135379061
Minimum: 1730.0
Maximum: 666666.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1730
5-th percentile: 18985.5
Q1: 35303
median: 51381.5
Q3: 68522
95-th percentile: 84130
Maximum: 666666
Range: 664936
Interquartile range (IQR): 33219

Descriptive statistics

Standard deviation: 25173.07666
Coefficient of variation (CV): 0.4818067173
Kurtosis: 159.6366996
Mean: 52247.25135
Median Absolute Deviation (MAD): 16557.5
Skewness: 6.763487373
Sum: 115779909
Variance: 633683788.6

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
7500                 12     0.5%
35860                4      0.2%
67445                3      0.1%
39922                3      0.1%
18690                3      0.1%
83844                3      0.1%
37760                3      0.1%
46098                3      0.1%
63841                3      0.1%
80134                3      0.1%
Other values (1964)  2176   97.1%
(Missing)            24     1.1%

Value   Count  Frequency (%)
1730    1      < 0.1%
2447    1      < 0.1%
3502    1      < 0.1%
4023    1      < 0.1%
4428    1      < 0.1%

Value   Count  Frequency (%)
666666  1      < 0.1%
162397  1      < 0.1%
160803  1      < 0.1%
157733  1      < 0.1%
157243  1      < 0.1%
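
The derived Income statistics can be reproduced from the reported quartiles, extremes, mean and standard deviation. The sketch below does that arithmetic and, as an additional illustration that the report itself does not perform, applies Tukey's 1.5 × IQR rule of thumb to show how far the maximum of 666666 lies from the bulk of the data. Variable names are illustrative.

```python
# Income statistics copied from the report above.
q1, q3 = 35303.0, 68522.0
minimum, maximum = 1730.0, 666666.0
mean, std = 52247.25135, 25173.07666

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation

# Tukey's rule of thumb (an assumption here, not something the report uses):
# values more than 1.5 * IQR outside the quartiles are potential outliers.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

print(iqr, value_range, round(cv, 4))
print(upper_fence, maximum > upper_fence)
```

The maximum lies far above the upper fence, which is consistent with the very high skewness and kurtosis reported for this variable.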
 

Kidhome
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       1293   57.7%
1       899    40.1%
2       48     2.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Teenhome
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       1158   51.7%
1       1030   46.0%
2       52     2.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Dt_Customer
Categorical

HIGH CARDINALITY

Distinct count: 663
Unique (%): 29.6%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value                Count  Frequency (%)
2012-08-31           12     0.5%
2014-05-12           11     0.5%
2012-09-12           11     0.5%
2013-02-14           11     0.5%
2014-05-22           10     0.4%
2013-08-20           10     0.4%
2012-10-29           9      0.4%
2014-03-23           9      0.4%
2014-04-05           9      0.4%
2013-01-02           9      0.4%
Other values (653)   2139   95.5%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Recency
Real number (ℝ≥0)

ZEROS

Distinct count: 100
Unique (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.109375
Minimum: 0
Maximum: 99
Zeros: 28
Zeros (%): 1.2%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 24
median: 49
Q3: 74
95-th percentile: 94
Maximum: 99
Range: 99
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 28.96245281
Coefficient of variation (CV): 0.5897540502
Kurtosis: -1.201896799
Mean: 49.109375
Median Absolute Deviation (MAD): 25
Skewness: -0.001986658634
Sum: 110005
Variance: 838.8236727

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
56                   37     1.7%
54                   32     1.4%
30                   32     1.4%
46                   31     1.4%
49                   30     1.3%
92                   30     1.3%
65                   30     1.3%
71                   29     1.3%
3                    29     1.3%
29                   29     1.3%
Other values (90)    1931   86.2%

Value   Count  Frequency (%)
0       28     1.2%
1       24     1.1%
2       28     1.2%
3       29     1.3%
4       27     1.2%

Value   Count  Frequency (%)
99      17     0.8%
98      22     1.0%
97      20     0.9%
96      25     1.1%
95      19     0.8%
 

MntWines
Real number (ℝ≥0)

Distinct count: 776
Unique (%): 34.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 303.9357142857143
Minimum: 0
Maximum: 1493
Zeros: 13
Zeros (%): 0.6%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 23.75
median: 173.5
Q3: 504.25
95-th percentile: 1000
Maximum: 1493
Range: 1493
Interquartile range (IQR): 480.5

Descriptive statistics

Standard deviation: 336.5973926
Coefficient of variation (CV): 1.107462456
Kurtosis: 0.5987435935
Mean: 303.9357143
Median Absolute Deviation (MAD): 164.5
Skewness: 1.175770564
Sum: 680816
Variance: 113297.8047

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
2                    42     1.9%
5                    40     1.8%
6                    37     1.7%
1                    37     1.7%
4                    33     1.5%
8                    30     1.3%
3                    30     1.3%
9                    28     1.2%
12                   25     1.1%
10                   24     1.1%
Other values (766)   1914   85.4%

Value   Count  Frequency (%)
0       13     0.6%
1       37     1.7%
2       42     1.9%
3       30     1.3%
4       33     1.5%

Value   Count  Frequency (%)
1493    1      < 0.1%
1492    2      0.1%
1486    1      < 0.1%
1478    2      0.1%
1462    1      < 0.1%
 

MntFruits
Real number (ℝ≥0)

ZEROS

Distinct count: 158
Unique (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.302232142857143
Minimum: 0
Maximum: 199
Zeros: 400
Zeros (%): 17.9%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 8
Q3: 33
95-th percentile: 123
Maximum: 199
Range: 199
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 39.77343376
Coefficient of variation (CV): 1.51216952
Kurtosis: 4.050976251
Mean: 26.30223214
Median Absolute Deviation (MAD): 8
Skewness: 2.102063305
Sum: 58917
Variance: 1581.926033

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
0                    400    17.9%
1                    162    7.2%
2                    120    5.4%
3                    116    5.2%
4                    104    4.6%
7                    67     3.0%
5                    65     2.9%
6                    62     2.8%
12                   50     2.2%
8                    48     2.1%
Other values (148)   1046   46.7%

Value   Count  Frequency (%)
0       400    17.9%
1       162    7.2%
2       120    5.4%
3       116    5.2%
4       104    4.6%

Value   Count  Frequency (%)
199     2      0.1%
197     1      < 0.1%
194     3      0.1%
193     2      0.1%
190     1      < 0.1%
 

MntMeatProducts
Real number (ℝ≥0)

Distinct count: 558
Unique (%): 24.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.95
Minimum: 0
Maximum: 1725
Zeros: 1
Zeros (%): < 0.1%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 16
median: 67
Q3: 232
95-th percentile: 687.1
Maximum: 1725
Range: 1725
Interquartile range (IQR): 216

Descriptive statistics

Standard deviation: 225.7153725
Coefficient of variation (CV): 1.351993846
Kurtosis: 5.516724101
Mean: 166.95
Median Absolute Deviation (MAD): 59
Skewness: 2.083233113
Sum: 373968
Variance: 50947.42939

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
7                    53     2.4%
5                    50     2.2%
11                   49     2.2%
8                    46     2.1%
6                    43     1.9%
3                    40     1.8%
10                   40     1.8%
9                    38     1.7%
16                   36     1.6%
12                   35     1.6%
Other values (548)   1810   80.8%

Value   Count  Frequency (%)
0       1      < 0.1%
1       14     0.6%
2       30     1.3%
3       40     1.8%
4       30     1.3%

Value   Count  Frequency (%)
1725    2      0.1%
1622    1      < 0.1%
1607    1      < 0.1%
1582    1      < 0.1%
984     1      < 0.1%
 

MntFishProducts
Real number (ℝ≥0)

ZEROS

Distinct count: 182
Unique (%): 8.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.52544642857143
Minimum: 0
Maximum: 259
Zeros: 384
Zeros (%): 17.1%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
median: 12
Q3: 50
95-th percentile: 168.05
Maximum: 259
Range: 259
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 54.6289794
Coefficient of variation (CV): 1.45578493
Kurtosis: 3.096460912
Mean: 37.52544643
Median Absolute Deviation (MAD): 12
Skewness: 1.919768971
Sum: 84057
Variance: 2984.325391

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
0                    384    17.1%
2                    156    7.0%
3                    130    5.8%
4                    108    4.8%
6                    82     3.7%
7                    66     2.9%
8                    58     2.6%
10                   55     2.5%
13                   48     2.1%
12                   47     2.1%
Other values (172)   1106   49.4%

Value   Count  Frequency (%)
0       384    17.1%
1       10     0.4%
2       156    7.0%
3       130    5.8%
4       108    4.8%

Value   Count  Frequency (%)
259     1      < 0.1%
258     3      0.1%
254     1      < 0.1%
253     1      < 0.1%
250     3      0.1%
 

MntSweetProducts
Real number (ℝ≥0)

ZEROS

Distinct count: 177
Unique (%): 7.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.06294642857143
Minimum: 0
Maximum: 263
Zeros: 419
Zeros (%): 18.7%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 8
Q3: 33
95-th percentile: 126
Maximum: 263
Range: 263
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 41.28049849
Coefficient of variation (CV): 1.525351225
Kurtosis: 4.376548261
Mean: 27.06294643
Median Absolute Deviation (MAD): 8
Skewness: 2.136080712
Sum: 60621
Variance: 1704.079555

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
0                    419    18.7%
1                    161    7.2%
2                    128    5.7%
3                    101    4.5%
4                    82     3.7%
5                    65     2.9%
6                    64     2.9%
7                    57     2.5%
8                    56     2.5%
12                   45     2.0%
Other values (167)   1062   47.4%

Value   Count  Frequency (%)
0       419    18.7%
1       161    7.2%
2       128    5.7%
3       101    4.5%
4       82     3.7%

Value   Count  Frequency (%)
263     1      < 0.1%
262     1      < 0.1%
198     1      < 0.1%
197     1      < 0.1%
196     1      < 0.1%
 

MntGoldProds
Real number (ℝ≥0)

ZEROS

Distinct count: 213
Unique (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.021875
Minimum: 0
Maximum: 362
Zeros: 61
Zeros (%): 2.7%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 9
median: 24
Q3: 56
95-th percentile: 165.05
Maximum: 362
Range: 362
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 52.16743891
Coefficient of variation (CV): 1.185034461
Kurtosis: 3.55170925
Mean: 44.021875
Median Absolute Deviation (MAD): 18
Skewness: 1.886105609
Sum: 98609
Variance: 2721.441683

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
1                    73     3.3%
4                    70     3.1%
3                    69     3.1%
5                    63     2.8%
12                   63     2.8%
2                    62     2.8%
0                    61     2.7%
6                    57     2.5%
7                    54     2.4%
10                   49     2.2%
Other values (203)   1619   72.3%

Value   Count  Frequency (%)
0       61     2.7%
1       73     3.3%
2       62     2.8%
3       69     3.1%
4       70     3.1%

Value   Count  Frequency (%)
362     1      < 0.1%
321     1      < 0.1%
291     1      < 0.1%
262     1      < 0.1%
249     1      < 0.1%
 

NumDealsPurchases
Real number (ℝ≥0)

ZEROS

Distinct count: 15
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.325
Minimum: 0
Maximum: 15
Zeros: 46
Zeros (%): 2.1%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 6
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.932237501
Coefficient of variation (CV): 0.8310698928
Kurtosis: 8.936914321
Mean: 2.325
Median Absolute Deviation (MAD): 1
Skewness: 2.418569388
Sum: 5208
Variance: 3.73354176

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
1                    970    43.3%
2                    497    22.2%
3                    297    13.3%
4                    189    8.4%
5                    94     4.2%
6                    61     2.7%
0                    46     2.1%
7                    40     1.8%
8                    14     0.6%
9                    8      0.4%
Other values (5)     24     1.1%

Value   Count  Frequency (%)
0       46     2.1%
1       970    43.3%
2       497    22.2%
3       297    13.3%
4       189    8.4%

Value   Count  Frequency (%)
15      7      0.3%
13      3      0.1%
12      4      0.2%
11      5      0.2%
10      5      0.2%
 

NumWebPurchases
Real number (ℝ≥0)

ZEROS

Distinct count: 15
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.084821428571429
Minimum: 0
Maximum: 27
Zeros: 49
Zeros (%): 2.2%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 9
Maximum: 27
Range: 27
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.778714147
Coefficient of variation (CV): 0.680253518
Kurtosis: 5.703128364
Mean: 4.084821429
Median Absolute Deviation (MAD): 2
Skewness: 1.382794296
Sum: 9150
Variance: 7.721252313

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
2                    373    16.7%
1                    354    15.8%
3                    336    15.0%
4                    280    12.5%
5                    220    9.8%
6                    205    9.2%
7                    155    6.9%
8                    102    4.6%
9                    75     3.3%
0                    49     2.2%
Other values (5)     91     4.1%

Value   Count  Frequency (%)
0       49     2.2%
1       354    15.8%
2       373    16.7%
3       336    15.0%
4       280    12.5%

Value   Count  Frequency (%)
27      2      0.1%
25      1      < 0.1%
23      1      < 0.1%
11      44     2.0%
10      43     1.9%
 

NumCatalogPurchases
Real number (ℝ≥0)

ZEROS

Distinct count: 14
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6620535714285714
Minimum: 0
Maximum: 28
Zeros: 586
Zeros (%): 26.2%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 4
95-th percentile: 9
Maximum: 28
Range: 28
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.923100656
Coefficient of variation (CV): 1.098062296
Kurtosis: 8.047436789
Mean: 2.662053571
Median Absolute Deviation (MAD): 2
Skewness: 1.880988778
Sum: 5963
Variance: 8.544517442

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
0                    586    26.2%
1                    497    22.2%
2                    276    12.3%
3                    184    8.2%
4                    182    8.1%
5                    140    6.2%
6                    128    5.7%
7                    79     3.5%
8                    55     2.5%
10                   48     2.1%
Other values (4)     65     2.9%

Value   Count  Frequency (%)
0       586    26.2%
1       497    22.2%
2       276    12.3%
3       184    8.2%
4       182    8.1%

Value   Count  Frequency (%)
28      3      0.1%
22      1      < 0.1%
11      19     0.8%
10      48     2.1%
9       42     1.9%
 

NumStorePurchases
Real number (ℝ≥0)

Distinct count: 14
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.790178571428571
Minimum: 0
Maximum: 13
Zeros: 15
Zeros (%): 0.7%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
median: 5
Q3: 8
95-th percentile: 12
Maximum: 13
Range: 13
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.250958146
Coefficient of variation (CV): 0.5614607746
Kurtosis: -0.6220482771
Mean: 5.790178571
Median Absolute Deviation (MAD): 2
Skewness: 0.7022372855
Sum: 12970
Variance: 10.56872886

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
3                    490    21.9%
4                    323    14.4%
2                    223    10.0%
5                    212    9.5%
6                    178    7.9%
8                    149    6.7%
7                    143    6.4%
10                   125    5.6%
9                    106    4.7%
12                   105    4.7%
Other values (4)     186    8.3%

Value   Count  Frequency (%)
0       15     0.7%
1       7      0.3%
2       223    10.0%
3       490    21.9%
4       323    14.4%

Value   Count  Frequency (%)
13      83     3.7%
12      105    4.7%
11      81     3.6%
10      125    5.6%
9       106    4.7%
 

NumWebVisitsMonth
Real number (ℝ≥0)

Distinct count: 16
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.316517857142857
Minimum: 0
Maximum: 20
Zeros: 11
Zeros (%): 0.5%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 7
95-th percentile: 8
Maximum: 20
Range: 20
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.42664501
Coefficient of variation (CV): 0.4564350341
Kurtosis: 1.821613827
Mean: 5.316517857
Median Absolute Deviation (MAD): 2
Skewness: 0.2079255568
Sum: 11909
Variance: 5.888606002

Histogram with fixed size bins (bins=10)

Value                Count  Frequency (%)
7                    393    17.5%
8                    342    15.3%
6                    340    15.2%
5                    281    12.5%
4                    218    9.7%
3                    205    9.2%
2                    202    9.0%
1                    153    6.8%
9                    83     3.7%
0                    11     0.5%
Other values (6)     12     0.5%

Value   Count  Frequency (%)
0       11     0.5%
1       153    6.8%
2       202    9.0%
3       205    9.2%
4       218    9.7%

Value   Count  Frequency (%)
20      3      0.1%
19      2      0.1%
17      1      < 0.1%
14      2      0.1%
13      1      < 0.1%
 

AcceptedCmp3
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2077   92.7%
1       163    7.3%
 

AcceptedCmp4
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2073   92.5%
1       167    7.5%
 

AcceptedCmp5
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2077   92.7%
1       163    7.3%
 

AcceptedCmp1
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2096   93.6%
1       144    6.4%
 

AcceptedCmp2
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2210   98.7%
1       30     1.3%
 

Complain
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       2219   99.1%
1       21     0.9%
 

Z_CostContact
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
3       2240   100.0%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Z_Revenue
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
11      2240   100.0%

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Response
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Value   Count  Frequency (%)
0       1906   85.1%
1       334    14.9%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
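
The calculation just described can be sketched in a few lines of plain Python. This is an illustrative sketch, not the report generator's own implementation; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(x, [2 * v + 1 for v in x])       # perfectly linear, increasing
r_down = pearson_r(x, [-3 * v + 10 for v in x])   # perfectly linear, decreasing
print(r_up, r_down)
```

The explicit check for zero variance matters for this dataset: constant columns such as Z_CostContact and Z_Revenue have a standard deviation of zero, so r is undefined for them.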

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
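
A minimal sketch of that procedure follows: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is a common convention that is assumed here rather than taken from the report; the function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but clearly not linear
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values enters the calculation, the cubic relationship in the example is treated as a perfect monotonic association.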

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
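
The pair-counting definition can be sketched directly. The version below is the simple variant often called τ-a, in which tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie correction instead (τ-b), and which variant the report uses is not stated here. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y: counted as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

In the example, five of the six pairs are concordant and one is discordant, so τ is (5 - 1) / 6. The double loop over pairs is quadratic in the number of observations, which is fine for a sketch but slow for large columns.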

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
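
A minimal sketch of a bias-corrected Cramér's V, following the correction usually attributed to Bergsma (2013), is shown below. It is an illustration under stated assumptions, not the report generator's own code: the function name is hypothetical, and returning 0.0 for a constant variable is a convention chosen for this example.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long categorical sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic from observed vs expected cell counts.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate case (a constant variable); convention chosen here
    return math.sqrt(phi2_corr / denom)

v_perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
v_indep = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(v_perfect, v_indep)
```

The first example pairs every "a" with "x" and every "b" with "y", giving perfect association; the second fills all four cells equally, so the corrected statistic is clipped to zero.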

Missing values

Sample

First rows

ID, Year_Birth, Education, Marital_Status, Income, Kidhome, Teenhome, Dt_Customer, Recency, MntWines, MntFruits, MntMeatProducts, MntFishProducts, MntSweetProducts, MntGoldProds, NumDealsPurchases, NumWebPurchases, NumCatalogPurchases, NumStorePurchases, NumWebVisitsMonth, AcceptedCmp3, AcceptedCmp4, AcceptedCmp5, AcceptedCmp1, AcceptedCmp2, Complain, Z_CostContact, Z_Revenue, Response
055241957GraduationSingle58138.0002012-09-04586358854617288883810470000003111
121741954GraduationSingle46344.0112014-03-08381116216211250000003110
241411965GraduationTogether71613.0002013-08-21264264912711121421821040000003110
361821984GraduationTogether26646.0102014-02-1026114201035220460000003110
453241981PhDMarried58293.0102014-01-199417343118462715553650000003110
574461967MasterTogether62513.0012013-09-09165204298042142641060000003110
69651971GraduationDivorced55635.0012012-11-133423565164504927473760000003110
761771985PhDMarried33454.0102013-05-08327610563123240480000003110
848551974PhDTogether30351.0102013-06-061914024332130290000003111
958991950PhDTogether5648.0112014-03-1368280611131100201000003110

Last rows

ID, Year_Birth, Education, Marital_Status, Income, Kidhome, Teenhome, Dt_Customer, Recency, MntWines, MntFruits, MntMeatProducts, MntFishProducts, MntSweetProducts, MntGoldProds, NumDealsPurchases, NumWebPurchases, NumCatalogPurchases, NumStorePurchases, NumWebVisitsMonth, AcceptedCmp3, AcceptedCmp4, AcceptedCmp5, AcceptedCmp1, AcceptedCmp2, Complain, Z_CostContact, Z_Revenue, Response
223070041984GraduationSingle11012.0102013-03-1682243267123331291000003110
223198171970MasterSingle44802.0002012-08-2171853101431310202941280000003110
223280801986GraduationSingle26816.0002012-08-1750516343100340000003110
223394321977GraduationTogether666666.0102013-06-0223914188112431360000003110
223483721974GraduationMarried34421.0102013-07-0181337629110270000003110
2235108701967GraduationMarried61223.0012013-06-13467094318242118247293450000003110
223640011946PhDTogether64014.0212014-06-1056406030008782570001003110
223772701981GraduationDivorced56981.0002014-01-2591908482173212241231360100003110
223882351956MasterTogether69245.0012014-01-248428302148030612651030000003110
223994051954PhDMarried52869.0112012-10-1540843612121331470000003111